Vision-based localization approaches now power emerging navigation pipelines for a wide range of use cases, from robotics to assistive technology. Compared with sensor-based solutions, vision-based localization does not require pre-installed sensor infrastructure, which is expensive, time-consuming, and often infeasible to deploy. In this paper, we propose a vision-based localization pipeline for a specific use case: navigation support for end users who are blind or have low vision. Given a query image taken by the end user on a mobile application, the pipeline leverages a visual place recognition (VPR) algorithm to find similar images in a reference image database of the target space. The geo-locations of these similar images are used in downstream tasks: a weighted-average method estimates the end user's location, and a perspective-n-point (PnP) algorithm estimates the end user's orientation. In addition, the system implements Dijkstra's algorithm to compute the shortest path over a navigable map that includes the trip origin and destination. The maps used for localization and navigation are built with a customized graphical user interface that projects a sparse 3D map, reconstructed from a sequence of images, onto the corresponding a priori 2D floor plan. The sequential images used for map construction can be collected in a pre-mapping step or scraped from public databases/citizen science. The end-to-end system can be installed on any internet-connected device with a camera running our custom mobile application. For evaluation purposes, mapping and localization were tested in a complex hospital environment. The evaluation results demonstrate that our system can achieve localization with an average error of less than 1 meter without knowledge of the camera's intrinsic parameters, such as focal length.
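As a rough illustration of the localization and routing steps above (the weighted-average position estimate from retrieved reference images and the shortest path over the navigable map), the sketch below is a minimal Python example; the data structures, weighting scheme, and function names are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch: similarity-weighted location estimate + Dijkstra routing.
import heapq
import numpy as np


def weighted_average_location(ref_locations, similarity_scores):
    """Estimate the query location as the similarity-weighted mean of the
    geo-locations of the top retrieved reference images."""
    locs = np.asarray(ref_locations, dtype=float)   # shape (k, 2): x, y on the floor plan
    w = np.asarray(similarity_scores, dtype=float)
    w = w / w.sum()                                 # normalize weights
    return (w[:, None] * locs).sum(axis=0)


def dijkstra_shortest_path(graph, origin, destination):
    """Shortest path on a navigable map given as {node: [(neighbor, cost), ...]}."""
    dist, prev = {origin: 0.0}, {}
    pq = [(0.0, origin)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == destination:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, cost in graph.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    # Reconstruct the path from destination back to origin.
    path, node = [destination], destination
    while node != origin:
        node = prev[node]
        path.append(node)
    return path[::-1]


# Toy usage: three retrieved reference images and a 4-node corridor graph.
query_xy = weighted_average_location([(1.0, 2.0), (1.2, 2.1), (0.9, 1.8)], [0.9, 0.7, 0.5])
route = dijkstra_shortest_path(
    {"A": [("B", 1.0)], "B": [("C", 2.0), ("D", 5.0)], "C": [("D", 1.0)], "D": []},
    "A", "D")
print(query_xy, route)   # -> approx [1.04, 1.99] and ['A', 'B', 'C', 'D']
```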
Deep learning has made great strides in object detection in images. The detection accuracy and computational cost of object detection depend on the spatial resolution of the image, which may be constrained by camera and storage considerations. Compression is often achieved by reducing either the spatial or amplitude resolution, or sometimes both, with well-known effects on performance. Detection accuracy also depends on the distance of the object of interest from the camera. Our work examines the impact of spatial and amplitude resolution, as well as object distance, on object detection accuracy and computational cost. We develop a resolution-adaptive variant of YOLOv5 (RA-YOLO), which varies the number of scales in the feature pyramid and detection head based on the spatial resolution of the input image. To train and evaluate this new method, we create a dataset of images with diverse spatial and amplitude resolutions by combining images from the TJU and Eurocity datasets and generating different resolutions through spatial resizing and compression. We first show that RA-YOLO achieves a good trade-off between detection accuracy and inference time over a wide range of spatial resolutions. We then evaluate the impact of spatial and amplitude resolution on object detection accuracy using the proposed RA-YOLO model. We demonstrate that the optimal spatial resolution that leads to the highest detection accuracy depends on the "tolerated" image size. We further assess the impact of object-to-camera distance on detection accuracy and show that higher spatial resolution enables a greater detection range. These results provide important guidelines for choosing the image spatial resolution and compression settings in practical applications based on the available bandwidth, storage, desired inference time, and/or desired detection range.
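To make the resolution-adaptive idea concrete, the following sketch shows one possible way to choose how many feature-pyramid scales and detection heads to run from the input resolution; the thresholds, level counts, and cost proxy are illustrative assumptions and not RA-YOLO's actual rules.

```python
# Minimal sketch: pick the number of pyramid levels from the input resolution
# and roughly estimate per-image cost. All numbers are illustrative assumptions.
def select_pyramid_levels(height: int, width: int) -> int:
    """Return the number of pyramid levels (and matching detection heads) to use."""
    short_side = min(height, width)
    if short_side >= 1536:       # high-resolution input: keep all scales
        return 4
    if short_side >= 768:        # medium resolution
        return 3
    return 2                     # low-resolution input: drop the finest scales


def inference_cost_estimate(height: int, width: int, levels: int) -> float:
    """Very rough cost proxy: grid cells processed across pyramid levels,
    where level i operates at stride 2**(i + 3) (a common detector convention)."""
    return sum((height // 2 ** (i + 3)) * (width // 2 ** (i + 3)) for i in range(levels))


for h, w in [(2048, 2448), (1024, 1224), (512, 612)]:
    k = select_pyramid_levels(h, w)
    print(h, w, "->", k, "levels, ~", inference_cost_estimate(h, w, k), "cells")
```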
People with blindness and low vision (pBLV) face significant challenges in locating a final destination or targeting a specific object in an unfamiliar environment. Moreover, beyond initially locating and orienting toward a target object, approaching the final target from one's current position is often frustrating and challenging, especially when one breaks away from the initially planned route to avoid obstacles. In this paper, we develop a novel wearable navigation solution that provides real-time guidance for users to efficiently approach a target object of interest in an unfamiliar environment. Our system contains two key visual computing functions: initial target object localization in 3D and continuous estimation of the user's trajectory, both based on 2D video captured by a low-cost monocular camera mounted in front of the user's chest. These functions enable the system to suggest an initial navigation path, continuously update the path as the user moves, and offer timely recommendations for correcting the user's path. Our experiments demonstrate that the system can operate with an error of less than 0.5 meters both outdoors and indoors. The system is entirely vision-based, requires no additional sensors for navigation, and its computation can run on a Jetson processor in a wearable system to facilitate real-time navigation assistance.
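The path-correction behavior described above can be pictured with the following minimal sketch, which compares the estimated user position against the planned waypoint segment and issues a cue when the lateral deviation exceeds a threshold; the geometry, threshold, and messages are assumptions for illustration, not the authors' system.

```python
# Minimal sketch: lateral deviation from a planned path segment and a correction cue.
import numpy as np


def lateral_deviation(position, seg_start, seg_end):
    """Perpendicular distance from `position` to the planned segment."""
    p, a, b = map(np.asarray, (position, seg_start, seg_end))
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)   # clamp to the segment
    return float(np.linalg.norm(p - (a + t * ab)))


def correction_advice(position, seg_start, seg_end, threshold_m=0.5):
    d = lateral_deviation(position, seg_start, seg_end)
    return "stay on course" if d <= threshold_m else f"off path by {d:.2f} m, adjust heading"


print(correction_advice((1.0, 0.2), (0.0, 0.0), (5.0, 0.0)))   # within the 0.5 m tolerance
print(correction_advice((2.0, 1.1), (0.0, 0.0), (5.0, 0.0)))   # triggers a correction cue
```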
Visual place recognition (VPR) is critical not only for the localization and mapping of autonomous vehicles but also for assistive navigation for people with visual impairments. To enable long-term VPR systems at a large scale, several challenges need to be addressed. First, different applications may require different image view directions, such as the front view for autonomous vehicles and the side view for pedestrians with low vision. Second, VPR in metropolitan scenes often raises privacy concerns because pedestrian and vehicle identity information is captured in the images, calling for data anonymization before VPR queries and database construction. Both factors can lead to VPR performance variations that are not yet well understood. To study their influences, we present the NYU-VPR dataset, which contains more than 200,000 images over a 2 km by 2 km area near the New York University campus, taken throughout the year 2016. We present benchmark results for several popular VPR algorithms, showing that the side view is significantly more challenging for current VPR methods while the influence of data anonymization is almost negligible, together with our hypothesized explanations and in-depth analysis.
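Benchmark results like those above are commonly reported as recall@N, counting a query as correct when any of its top-N retrieved database images lies within a distance threshold of the query's true location; the sketch below shows this metric on toy data, with the threshold and data layout as assumptions rather than the dataset's exact protocol.

```python
# Minimal sketch: recall@N for a VPR method, with a localization distance threshold.
import numpy as np


def recall_at_n(query_xy, db_xy, retrieved_idx, n=5, threshold_m=25.0):
    """query_xy: (Q, 2), db_xy: (D, 2), retrieved_idx: (Q, K) ranked database indices."""
    hits = 0
    for q in range(len(query_xy)):
        top_n = db_xy[retrieved_idx[q, :n]]                  # (n, 2) candidate locations
        dists = np.linalg.norm(top_n - query_xy[q], axis=1)
        hits += bool((dists <= threshold_m).any())
    return hits / len(query_xy)


# Toy example: 2 queries, 4 database images, rankings from some VPR method.
q = np.array([[0.0, 0.0], [100.0, 0.0]])
db = np.array([[5.0, 0.0], [60.0, 0.0], [110.0, 0.0], [300.0, 0.0]])
ranks = np.array([[0, 1, 2, 3], [3, 1, 0, 2]])
print(recall_at_n(q, db, ranks, n=2))   # query 1 hits db[0]; query 2 misses -> 0.5
```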
Explainability is a vibrant research topic in the artificial intelligence community, with growing interest across methods and domains. Much has been written about the topic, yet explainability still lacks shared terminology and a framework capable of providing structural soundness to explanations. In our work, we address these issues by proposing a novel definition of explanation that is a synthesis of what can be found in the literature. We recognize that explanations are not atomic but the product of evidence stemming from the model and its input-output and the human interpretation of this evidence. Furthermore, we fit explanations into the properties of faithfulness (i.e., the explanation being a true description of the model's decision-making) and plausibility (i.e., how convincing the explanation looks to the user). Using our proposed theoretical framework simplifies how these properties are operationalized and provides new insight into common explanation methods that we analyze as case studies.
Fruit is a key crop in worldwide agriculture, feeding millions of people. The standard supply chain of fruit products involves quality checks to guarantee freshness, taste, and, most of all, safety. An important factor that determines fruit quality is its stage of ripeness. This is usually classified manually by experts in the field, which makes it a labor-intensive and error-prone process. Thus, there is a growing need for automation in the process of fruit ripeness classification. Many automatic methods have been proposed that employ a variety of feature descriptors for the food item to be graded. Machine learning and deep learning techniques dominate the top-performing methods. Furthermore, deep learning can operate on raw data and thus relieve users from having to compute complex engineered features, which are often crop-specific. In this survey, we review the latest methods proposed in the literature to automate fruit ripeness classification, highlighting the most common feature descriptors they operate on.
Soft actuators have attracted a great deal of interest in the context of rehabilitative and assistive robots for increasing safety and lowering costs as compared to rigid-body robotic systems. During actuation, soft actuators experience high levels of deformation, which can lead to microscale fractures in their elastomeric structure that fatigue the system over time and eventually lead to macroscale damage and failure. This paper reports finite element modeling (FEM) of pneu-nets at high bending angles, along with repetitive experimentation at high deformation rates, in order to study the effect and behavior of fatigue in soft robotic actuators, which results in deviation from the ideal behavior. Comparing the FEM model and experimental data, we show that FEM can model the performance of the actuator before fatigue up to a bending angle of 167 degrees with ~96% accuracy. We also show that the FEM model performance will drop to 80% due to fatigue after repetitive high-angle bending. The results of this paper objectively highlight the emergence of fatigue over cyclic activation of the system and the resulting deviation from the computational FEM model. Such behavior can be considered in future controllers to adapt to the time-variable and non-autonomous response dynamics of soft robots.
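One plausible way such model-versus-experiment accuracy figures can be computed is as one minus the mean relative error between FEM-predicted and measured bending angles; the sketch below uses hypothetical numbers and a hypothetical metric purely for illustration, not values from the paper.

```python
# Minimal sketch: model accuracy as 1 - mean relative error between FEM predictions
# and (hypothetical) measured bending angles, before and after fatigue.
import numpy as np

fem_angle_deg = np.array([40.0, 80.0, 120.0, 167.0])        # FEM-predicted bending angles
measured_fresh = np.array([41.5, 78.0, 116.0, 160.0])       # hypothetical pre-fatigue data
measured_fatigued = np.array([34.0, 66.0, 98.0, 132.0])     # hypothetical post-fatigue data


def model_accuracy(predicted, measured):
    return 1.0 - np.mean(np.abs(predicted - measured) / np.abs(measured))


print(f"pre-fatigue accuracy:  {model_accuracy(fem_angle_deg, measured_fresh):.2%}")
print(f"post-fatigue accuracy: {model_accuracy(fem_angle_deg, measured_fatigued):.2%}")
```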
This paper presents a learning framework to estimate an agent capability and task requirement model for multi-agent task allocation. With a set of team configurations and the corresponding task performances as the training data, linear task constraints can be learned and embedded in many existing optimization-based task allocation frameworks. Comprehensive computational evaluations are conducted to test the scalability and prediction accuracy of the learning framework with a limited number of team configuration and performance pairs. A ROS- and Gazebo-based simulation environment is developed to validate the proposed requirements-learning and task allocation framework in practical multi-agent exploration and manipulation tasks. Results show that the learning process for scenarios with 40 tasks and 6 types of agents takes around 12 seconds, ending up with prediction errors in the range of 0.5-2%.
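The constraint-learning step can be pictured as a least-squares fit from team composition (counts of each agent type) to observed task performance, whose coefficients then act as a linear requirement; the data, shapes, and feasibility rule in the sketch below are illustrative assumptions, not the paper's formulation.

```python
# Minimal sketch: fit a linear capability model with least squares and use it
# as a linear constraint for candidate teams. Toy data, illustrative only.
import numpy as np

# Each row: number of agents of each of 3 types assigned to a task.
team_configs = np.array([
    [1, 0, 0],
    [0, 2, 0],
    [1, 1, 0],
    [0, 0, 2],
    [2, 0, 1],
], dtype=float)
# Observed task performance score for each configuration.
performance = np.array([0.3, 0.8, 0.7, 0.5, 0.9])

# Least-squares estimate of per-agent-type capability coefficients w:
# performance ≈ team_configs @ w.
w, *_ = np.linalg.lstsq(team_configs, performance, rcond=None)
print("learned capability coefficients:", np.round(w, 3))

# Learned linear constraint for a task requiring performance >= 0.6:
# a candidate team x is predicted feasible if w @ x >= 0.6.
candidate = np.array([0.0, 1.0, 1.0])
print("candidate feasible:", bool(w @ candidate >= 0.6))
```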
Lesion segmentation is a crucial step in the radiomics workflow. Manual segmentation requires long execution times and is prone to variability, impairing the realization of radiomics studies and their robustness. In this study, a deep-learning automatic segmentation method was applied to computed tomography images of non-small cell lung cancer (NSCLC) patients. The use of manual versus automatic segmentation in the performance of survival radiomics models was also evaluated. Methods: a total of 899 NSCLC patients were included (2 proprietary datasets, A and B, and 1 public dataset, C). Automatic segmentation of lung lesions was performed by training a previously developed architecture, nnU-Net, including 2D, 3D, and cascade approaches. The quality of the automatic segmentation was evaluated with the Dice coefficient, using manual contours as the reference. The impact of automatic segmentation on the performance of a radiomics model for patient survival was explored by extracting radiomic hand-crafted and deep-learning features from the manual and automatic contours of dataset A, and the accuracies of the models were evaluated and compared. Results: the best agreement between automatic and manual contours, with a Dice of 0.78 ± 0.12, was achieved by averaging the predictions of the 2D and 3D models and applying a post-processing technique to extract the largest connected component. No statistical differences were observed in the performance of the survival models when using manual or automatic contours, or hand-crafted or deep features. The best classifiers showed accuracies between 0.65 and 0.78. Conclusion: the promising role of nnU-Net for automatic segmentation of lung lesions was confirmed, substantially reducing the time-consuming workload of physicians without impairing the accuracy of radiomics-based survival predictive models.
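The two evaluation and post-processing steps named above, the Dice coefficient against manual contours and keeping only the largest connected component of the predicted mask, can be sketched as follows; the toy arrays and the scipy-based implementation are illustrative, not the study's code.

```python
# Minimal sketch: Dice coefficient and largest-connected-component post-processing.
import numpy as np
from scipy import ndimage


def dice(a: np.ndarray, b: np.ndarray) -> float:
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0


def largest_connected_component(mask: np.ndarray) -> np.ndarray:
    labels, n = ndimage.label(mask)                       # label connected components
    if n == 0:
        return mask.astype(bool)
    sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
    return labels == (np.argmax(sizes) + 1)               # keep the biggest one


auto = np.zeros((8, 8), dtype=bool)
auto[1:4, 1:4] = True          # main lesion candidate
auto[6, 6] = True              # small spurious island removed by post-processing
manual = np.zeros((8, 8), dtype=bool)
manual[1:4, 1:5] = True

cleaned = largest_connected_component(auto)
print(f"Dice before cleanup: {dice(auto, manual):.2f}, after: {dice(cleaned, manual):.2f}")
```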
In this paper, we introduce RISP, a reduced instruction spiking processor. While most spiking neuroprocessors are based on the brain or notions from the brain, we make the case for a spiking processor that simplifies rather than complicates. As such, it features discrete integration cycles, a configurable leak, and little else. We present the computational model of RISP and highlight the benefits of its simplicity. We show how it aids in developing hand-built neural networks for simple computational tasks, detail how it can be used to simplify neural networks built with more complicated machine learning techniques, and demonstrate that it performs similarly to other spiking neuroprocessors.
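A simplified spiking neuron with discrete integration cycles and a configurable leak, as the abstract describes, might look like the following sketch; the threshold, reset, and leak semantics here are assumptions for illustration rather than RISP's actual specification.

```python
# Minimal sketch: a spiking neuron with discrete integration cycles, a fixed
# threshold, and an optional leak. Illustrative only, not the RISP model.
class SimpleSpikingNeuron:
    def __init__(self, threshold: float = 1.0, leak: bool = True):
        self.threshold = threshold
        self.leak = leak          # if True, charge drains to zero in any cycle without a spike
        self.charge = 0.0

    def step(self, weighted_inputs):
        """One discrete integration cycle; returns True if the neuron fires."""
        self.charge += sum(weighted_inputs)
        if self.charge >= self.threshold:
            self.charge = 0.0     # reset after firing
            return True
        if self.leak:
            self.charge = 0.0     # leak away any sub-threshold charge
        return False


# Toy usage: with leak enabled, two sub-threshold inputs in separate cycles never fire;
# with leak disabled, the charge accumulates across cycles and eventually fires.
leaky, non_leaky = SimpleSpikingNeuron(leak=True), SimpleSpikingNeuron(leak=False)
print([leaky.step([0.6]) for _ in range(2)])       # [False, False]
print([non_leaky.step([0.6]) for _ in range(2)])   # [False, True]
```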